This report contains the workflow for the hydrological analysis of the Suwannee River basin originally developed by Matthew Cohen. The data used in this report come from:
the PRISM dataset, a high-resolution (4 km) gridded dataset of temperature and precipitation covering the entire United States. Monthly data are available from 1891 to the present, but the data used here are daily data from 1981 to the present. The data are available for download from the PRISM website.
the Lake City rainfall gage, which is the best long-record gage in the basin.
additional rainfall data from long-term gages at Mayo, Madison, High Springs, and Live Oak, which have records of varying length with some missing data throughout. These come from the Florida Climate Center.
the USGS streamflow data for 8 gages in the Suwannee River basin. The data are available from the USGS website.
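The streamflow records can be pulled directly in R with the dataRetrieval package. A minimal sketch for one of the eight gages (02315500; parameter code 00060 is daily mean discharge, and the dates here just mirror the PRISM period):

```r
library(dataRetrieval)

# Sketch: download daily mean discharge (parameter code 00060) for one
# basin gage; the date range is illustrative, matching the PRISM period
q_gage <- readNWISdv(siteNumbers = "02315500",
                     parameterCd = "00060",
                     startDate   = "1981-01-01",
                     endDate     = "2022-12-31")
head(q_gage)
```

This returns one row per day with the discharge in cfs plus USGS qualification codes.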
The workflow is divided into the following sections:
Load data
Calculate cumulative anomalies
Calculate the flow gain and loss for each subwatershed
Analyze the flow loss, or reversal, patterns for each subwatershed
2 Load data
2.1 GIS data
The first step is to get the GIS layers; this is how we link the PRISM data to the subwatershed polygons. I downloaded the HUC8 shapefiles for the Suwannee River basin from the NHDPlus website. Together, these contain the 8 subwatershed polygons for the Suwannee River basin. I then converted them to the same CRS as the PRISM data. The code to do this is below, but it is commented out, as it has already been run and takes some time.
Code
# All of this is commented out now to save on run time
# # Load the 4 most upstream shapefiles
# upper_suw <- st_read(here("data", "gis", "NHD_H_03110201_HU8_Shape", "Shape", "WBDHU8.shp")) %>%
#   st_transform(crs = 4269)
# alapaha <- st_read(here("data", "gis", "NHD_H_03110202_HU8_Shape", "Shape", "WBDHU8.shp")) %>%
#   st_transform(crs = 4269)
# withla <- st_read(here("data", "gis", "NHD_H_03110203_HU8_Shape", "Shape", "WBDHU8.shp")) %>%
#   st_transform(crs = 4269)
# little <- st_read(here("data", "gis", "NHD_H_03110204_HU8_Shape", "Shape", "WBDHU8.shp")) %>%
#   st_transform(crs = 4269)
#
# # Combine the withla and little shapefiles to be withla
# withla <- st_union(withla, little) %>%
#   # sum all the areas of the huc10s (areasqkm); each time a union is made,
#   # it adds a ".x" where x is the number of the union
#   mutate(areasqkm = sum(areasqkm, areasqkm.1)) %>%
#   # now drop all the other columns containing a "." in the name
#   dplyr::select(-contains("."))
#
# # Load the santa fe and lower suwannee
# sf <- st_read(here("data", "gis", "NHD_H_03110206_HU8_Shape", "Shape", "WBDHU8.shp")) %>%
#   st_transform(crs = 4269)
# lower_suw <- st_read(here("data", "gis", "NHD_H_03110205_HU8_Shape", "Shape", "WBDHU8.shp")) %>%
#   st_transform(crs = 4269)
#
# # Get just the huc10s for the lower suwannee so it can be split into middle and lower
# lower_suw_huc10 <- st_read(here("data", "gis", "NHD_H_03110205_HU8_Shape", "Shape", "WBDHU10.shp")) %>%
#   st_transform(crs = 4269)
#
# # Merge the first four huc10 polygons into a single polygon; call that the middle suwannee
# middle_suw <- st_union(lower_suw_huc10[1, ], lower_suw_huc10[2, ]) %>%
#   st_union(st_union(lower_suw_huc10[3, ], lower_suw_huc10[4, ])) %>%
#   # sum all the areas of the huc10s (areasqkm); each time a union is made,
#   # it adds a ".x" where x is the number of the union
#   mutate(areasqkm = sum(areasqkm, areasqkm.1, areasqkm.2, areasqkm.1.1)) %>%
#   # now drop all the other columns containing a "." in the name
#   dplyr::select(-contains(".")) %>%
#   mutate(name = "Middle Suwannee")
#
# # Merge the last four huc10 polygons into a single polygon; call that the lower suwannee
# lower_suw <- st_union(lower_suw_huc10[5, ], lower_suw_huc10[6, ]) %>%
#   st_union(st_union(lower_suw_huc10[7, ], lower_suw_huc10[8, ])) %>%
#   # sum all the areas of the huc10s (areasqkm); each time a union is made,
#   # it adds a ".x" where x is the number of the union
#   mutate(areasqkm = sum(areasqkm, areasqkm.1, areasqkm.2, areasqkm.1.1)) %>%
#   # now drop all the other columns containing a "." in the name
#   dplyr::select(-contains(".")) %>%
#   mutate(name = "Lower Suwannee")
#
# # Get all the shapefiles together
# all <- bind_rows(upper_suw, alapaha, withla, sf, middle_suw, lower_suw)
#
# # Save this data for use later
# saveRDS(all, file = here("data", "gis", "suwannee_subwatersheds.RDS"))
Now we can load the GIS layers and take a look to make sure they are correct (Figure 1).
Code
# Load the suwannee subwatersheds
all <- readRDS(here("data", "gis", "suwannee_subwatersheds.RDS"))

# Discharge gage numbers
gages <- c("02319000", "02317500", "02315500", "02319500",
           "02320500", "02321500", "02322500", "02323500")

# Get the locations of the relevant USGS gages
usgs_meta <- readNWISsite(gages)

# Load the springs shapefile; this comes from the Suwannee River Water Management District site
springs <- st_read(here("data", "gis", "springs_SRWMD_24k", "sprsrwmd24.shp")) %>%
  st_transform(crs = 4269)

# Only want the springs that are within the watershed
springs <- st_intersection(springs, all)

# Load the florida shapefile
fl <- st_as_sf(maps::map("state", regions = "florida", fill = TRUE, plot = FALSE))

# Now get the flowlines for the major rivers in each HUC8
suwannee_flowline <- st_read(here("data", "gis", "NHD_H_03110201_HU8_Shape", "Shape", "NHDFlowline.shp")) %>%
  st_transform(crs = 4269) %>%
  st_zm(drop = TRUE, what = "ZM") %>%
  filter(gnis_name == "Suwannee River")
alapaha_flowline <- st_read(here("data", "gis", "NHD_H_03110202_HU8_Shape", "Shape", "NHDFlowline.shp")) %>%
  st_transform(crs = 4269) %>%
  st_zm(drop = TRUE, what = "ZM") %>%
  filter(gnis_name == "Alapaha River")
withla_flowline <- st_read(here("data", "gis", "NHD_H_03110203_HU8_Shape", "Shape", "NHDFlowline.shp")) %>%
  st_transform(crs = 4269) %>%
  st_zm(drop = TRUE, what = "ZM") %>%
  filter(gnis_name == "Withlacoochee River")
little_flowline <- st_read(here("data", "gis", "NHD_H_03110204_HU8_Shape", "Shape", "NHDFlowline.shp")) %>%
  st_transform(crs = 4269) %>%
  st_zm(drop = TRUE, what = "ZM") %>%
  filter(gnis_name == "Little River")
sf_flowline <- st_read(here("data", "gis", "NHD_H_03110206_HU8_Shape", "Shape", "NHDFlowline.shp")) %>%
  st_transform(crs = 4269) %>%
  st_zm(drop = TRUE, what = "ZM") %>%
  filter(gnis_name == "Santa Fe River")
sw_flowline <- st_read(here("data", "gis", "NHD_H_03110205_HU8_Shape", "Shape", "NHDFlowline.shp")) %>%
  st_transform(crs = 4269) %>%
  st_zm(drop = TRUE, what = "ZM") %>%
  filter(gnis_name == "Suwannee River")

# Merge all the flowlines to one geometry for easier plotting
flowlines <- bind_rows(suwannee_flowline, alapaha_flowline, withla_flowline,
                       little_flowline, sf_flowline, sw_flowline)

# Plot of the watersheds, flowlines, gages, and springs
# with north arrow, scale bar, and legend
p_main <- ggplot() +
  geom_sf(data = all) +
  geom_sf(data = flowlines, aes(color = "Major rivers"), size = 0.5) +
  geom_sf(data = springs, aes(fill = "Springs"), size = 1, shape = 21, color = "black") +
  geom_point(data = usgs_meta, aes(x = dec_long_va, y = dec_lat_va, fill = "USGS gages"),
             color = "black", size = 2, shape = 21) +
  ggrepel::geom_text_repel(data = usgs_meta,
                           aes(x = dec_long_va, y = dec_lat_va, label = site_no),
                           color = "black", size = 3) +
  # Add text for each subwatershed
  geom_sf_text(data = st_centroid(all), aes(label = name),
               color = "black", size = 3,
               nudge_x = c(0.05, 0, 0, 0, -0.5, -0.5),
               nudge_y = c(0.3, 0.4, 0.2, 0.12, 0, 0)) +
  scale_color_manual(values = "black") +
  scale_fill_manual(values = c("lightblue", "white")) +
  annotation_north_arrow(location = "bl", which_north = "true",
                         pad_x = unit(0.1, "in"), pad_y = unit(0.1, "in"),
                         style = north_arrow_fancy_orienteering) +
  annotation_scale(location = "br", width_hint = 0.4, text_cex = 0.8) +
  theme(axis.title = element_blank(),
        panel.background = element_rect(fill = "white"),
        legend.title = element_blank(),
        legend.position = c(0.85, 0.12),
        legend.spacing.y = unit(-0.5, "cm"),
        legend.background = element_rect(fill = "transparent")) +
  labs(title = "Suwannee River subwatersheds",
       caption = "Data from USGS NWIS, NHDPlus, and SRWMD",
       x = "", y = "")

# Add an inset of the florida map with the suwannee watershed;
# dissolve the internal subwatershed boundaries
p_inset <- ggplot() +
  geom_sf(data = fl) +
  geom_sf(data = st_union(all)) +
  geom_sf(data = flowlines, color = "black", size = 0.3) +
  theme_minimal() +
  theme(axis.title = element_blank(),
        legend.title = element_blank(),
        axis.text = element_blank(),
        panel.grid = element_blank(),
        panel.background = element_rect(fill = "white"))
Figure 1: Map of study area with subwatersheds, gages, springs, and flowlines.
2.2 Rainfall data
Now we can link the PRISM data to each subwatershed to get the daily rainfall from 1981–2022. The first step is to download the PRISM data. The code to do this is below, but it is commented out, as it has already been run and takes some time.
Code
# # Download daily PRISM data since 1981 ------------------------------------
# # Set the prism download folder
# prism_set_dl_dir(here("data", "prism"))
#
# #--- download PRISM precipitation data ---#
# # Only want to do this once because it's huge and takes forever;
# # commented out after downloaded
# get_prism_dailys(
#   type = "ppt",
#   minDate = "1981-01-01",
#   maxDate = "2022-12-31",
#   keepZip = FALSE
# )
#
# # Get the PRISM data for the subwatersheds --------------------------------
# # Again only want to do this once because it takes a while to load; code
# # here for reproducibility
# # Create a stack of rasters for all the prism data
# prism_rasters <- pd_stack(prism_archive_ls())
#
# # Load the Suwannee River basin shapefile with its 8 subwatershed polygons
# # and convert this to the same CRS as the prism data
# suw_sf <- readRDS(here("data", "gis", "suwannee_subwatersheds.RDS")) %>%
#   st_transform(crs = terra::crs(prism_rasters[[1]]))
#
# # Get the prism data by subwatershed; each prism point also contains the coverage fraction
# ppt_stack <- exact_extract(prism_rasters, suw_sf, progress = FALSE)
#
# # Combine the extracted data into a single data frame; id is the subwatershed number
# ppt <- bind_rows(ppt_stack, .id = "id")
#
# # Save these so don't have to load again
# saveRDS(ppt, here("data", "gis", "suwannee_prism_daily_ppt.RDS"))
# saveRDS(prism_rasters, here("data", "gis", "prism_daily_raster_stack.RDS"))
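The daily file names that the download produces encode the observation date, which is what the linking step below parses. A quick base-R sketch of pulling the date back out of a file name (the example name here is illustrative):

```r
# Sketch: extract the date from a PRISM daily file name.
# Daily names follow the pattern "PRISM_ppt_stable_4kmD1_YYYYMMDD_bil";
# the 8-digit run is the date.
fname <- "PRISM_ppt_stable_4kmD1_19810101_bil"   # illustrative name
date_str <- regmatches(fname, regexpr("[0-9]{8}", fname))
prism_date <- as.Date(date_str, format = "%Y%m%d")
prism_date  # 1981-01-01
```

The code below does the same thing with fixed-position `str_sub()` calls, which works because the daily file names all share one prefix length.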
Now we link the PRISM data with each watershed shapefile, weighting each grid cell by the “coverage fraction” included with the extracted PRISM data. This is done in the code below.
Code
# Load the previously calculated Suwannee River basin shapefile with its
# 8 subwatershed polygons and convert this to the same CRS as the prism data (NAD83)
suw_sf <- readRDS(here("data", "gis", "suwannee_subwatersheds.RDS")) %>%
  st_transform(crs = 4269)

# Get the PRISM data in usable format --------------------------------
# Read in the prism data
ppt <- readRDS(here("data", "gis", "suwannee_prism_daily_ppt.RDS"))

# Pivot to longer format; extract year, month, and day from the column names.
# Column names are in the form "PRISM_ppt_stable_4kmM3_YYYYMM_bil" (monthly)
# or "PRISM_ppt_stable_4kmD1_YYYYMMDD_bil" (daily, used here)
ppt_df <- ppt %>%
  pivot_longer(cols = -c(id, coverage_fraction),
               names_to = "year_month_day", values_to = "ppt") %>%
  mutate(year = as.integer(str_sub(year_month_day, 24, 27)),
         month = as.integer(str_sub(year_month_day, 28, 29)),
         day = as.integer(str_sub(year_month_day, 30, 31)))

# Next, filter out rows with NA in the month column (those are yearly summaries
# from before 1981), and then calculate the coverage-weighted mean of ppt
# for each subwatershed and day
ppt_df <- ppt_df %>%
  filter(!is.na(month)) %>%
  group_by(id, year, month, day) %>%
  summarise(ppt_aw = sum(ppt * coverage_fraction) / sum(coverage_fraction)) %>%
  mutate(id = as.numeric(id))

# Link that summary data back with the polygons
suw_final <- suw_sf %>%
  mutate(id = seq_len(nrow(.))) %>%
  left_join(ppt_df, by = "id")
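As a quick sanity check of the coverage-fraction weighting, here is a toy example with three hypothetical grid cells, the third only half inside the polygon (all values made up):

```r
# Toy check of the coverage-weighted mean used above: three made-up grid
# cells, the third only 50% inside the subwatershed polygon
ppt_cells <- c(10, 20, 30)     # mm of rain in each cell (made-up values)
cov_frac  <- c(1.0, 1.0, 0.5)  # fraction of each cell inside the polygon
ppt_aw <- sum(ppt_cells * cov_frac) / sum(cov_frac)
ppt_aw  # 18 mm: the partial cell counts half as much as a full one
```

A plain mean would give 20 mm; the weighted mean pulls toward the fully covered cells, which is the behavior we want for cells straddling the boundary.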
Let’s take a look at average annual rainfall for each subwatershed from 1981–2022 (Figure 2).
Code
# Calculate annual rainfall each year by watershed
ppt_yr <- st_drop_geometry(suw_final) %>%
  group_by(name, areasqkm, id, year) %>%
  summarise(ppt_ann = sum(ppt_aw)) %>%
  ungroup()

# Plot, arranging the bars by HUC
p_ppt_ann <- ggplot(ppt_yr,
                    aes(x = fct_relevel(name, "Lower Suwannee", "Santa Fe",
                                        "Middle Suwannee", "Withlacoochee",
                                        "Alapaha", "Upper Suwannee"),
                        y = ppt_ann)) +
  stat_summary(fun = mean, geom = "bar") +
  stat_summary(fun.data = mean_sdl, fun.args = list(mult = 1),
               geom = "errorbar", width = 0.2) +
  # y-axis minimum of 1000
  coord_cartesian(ylim = c(1000, NA)) +
  # scale_fill_viridis_c() +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(x = "Subwatershed", y = "Average Annual Precipitation (mm)")
Figure 2: Mean annual rainfall (1981–2022) by subwatershed with standard deviation as error bars.
Now, for the years before 1981, we will use the Lake City rainfall gage data (observed data). Matt has already downloaded this and provided it. We will load the data, plot it (Figure 3 (a)), and calculate and plot the exceedance probability to get an idea of “big” and “small” storms (Figure 3 (b)).
Code
# Load the observed rainfall data and some analysis -----------------------
# Load the measured Lake City rainfall data
lc_rain <- readxl::read_xlsx(here("data", "lake_city_rain.xlsx"))

# Plot the daily rainfall at Lake City
p_lc_rain <- ggplot(lc_rain, aes(x = as.Date(Date), y = `LCRain (mm)`)) +
  geom_line() +
  scale_x_date(date_labels = "%Y", date_breaks = "10 years") +
  theme_classic() +
  theme(axis.title.x = element_blank()) +
  labs(y = expression("observed rainfall at Lake City " * (mm ~ d^{-1})))
p_lc_rain

# Calculate the exceedance probability for the Lake City rainfall data;
# need to first remove all zero-rain days
lc_rain_exc <- rename(lc_rain, date = Date, ppt = `LCRain (mm)`) %>%
  filter(ppt > 0) %>%
  arrange(ppt) %>%
  mutate(rank = row_number(),
         exceedance = 1 - rank / n())

# Create a dataframe for plotting line segments at the closest exceedances to
# 10%, 20%, and 50% with the corresponding rainfall values; need to use
# slice_min for each exceedance level
lc_rain_exc_lines <- lc_rain_exc %>%
  slice_min(abs(exceedance - 0.1)) %>%
  bind_rows(lc_rain_exc %>% slice_min(abs(exceedance - 0.2))) %>%
  bind_rows(lc_rain_exc %>% slice_min(abs(exceedance - 0.501))) %>%
  mutate(ppt = round(ppt, 1))

# Plot the exceedance probability for the Lake City rainfall data on a log scale
p_lc_exc <- ggplot(lc_rain_exc, aes(x = exceedance, y = ppt)) +
  geom_line() +
  scale_x_reverse(expand = expansion(mult = c(0, 0.01)),
                  breaks = seq(0, 1, 0.1)) +
  scale_y_log10(expand = expansion(mult = c(0, 0))) +
  # Add the lines and labels from the dataframe
  geom_segment(data = lc_rain_exc_lines,
               aes(x = exceedance, xend = exceedance, y = 0, yend = ppt),
               linetype = "dashed") +
  geom_segment(data = lc_rain_exc_lines,
               aes(x = 1, xend = exceedance, y = ppt, yend = ppt),
               linetype = "dashed") +
  geom_text(data = lc_rain_exc_lines,
            aes(label = ppt, x = exceedance, y = ppt), nudge_y = 0.1) +
  annotation_logticks(sides = "l") +
  theme_classic() +
  labs(x = "exceedance probability", y = expression("rainfall " * (mm ~ d^{-1})))
p_lc_exc
(a) Daily rainfall (1931–2022)
(b) Rainfall exceedance probability (1931–2022) at Lake City, with 50%, 20%, and 10% probabilities shown.
Figure 3: Lake City rainfall summary.
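The exceedance calculation behind Figure 3 (b) — rank the wet days in ascending order, then take `1 - rank/n` — can be checked on a toy vector of made-up daily totals:

```r
# Toy check of the exceedance-probability formula on five made-up wet days
ppt <- sort(c(2, 5, 1, 20, 8))  # ascending: 1 2 5 8 20
exceedance <- 1 - seq_along(ppt) / length(ppt)
# The smallest value is exceeded 80% of the time; the largest, 0% of the time
rbind(ppt, exceedance)
```

Because ties are broken by row number, repeated rainfall values each get their own rank, which is why the full dataset's curve is strictly monotone.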
Let’s also take a look at the other long-term rain gages to check whether they are similar to Lake City. We will plot the daily data for each gage (Figure 4).
Code
# Load the measured rain data from Madison, Mayo, Live Oak, and White Springs.
# These are all files that were downloaded from the Florida Climate Center website;
# they are located in the "data"/"fcc" folder as .csv files.
# The first part of the file name before the underscore is the site; want to use that
rain_meas <- list.files(here("data", "fcc"), full.names = TRUE) %>%
  map_dfr(~ read_csv(.x) %>%
            mutate(site = str_extract(basename(.x), "^[^_]+"))) %>%
  # Add a leading 0 before the month and day columns if only one digit
  mutate(MONTH = str_pad(MONTH, 2, pad = "0"),
         DAY = str_pad(DAY, 2, pad = "0")) %>%
  mutate(date = ymd(paste0(YEAR, MONTH, DAY)),
         # Get into mm from inches; negative values flag missing data
         ppt = if_else(precipitation < 0, NA_real_, precipitation * 25.4)) %>%
  select(site, date, ppt)

# Get all that data together
rain_meas <- rain_meas %>%
  bind_rows(lc_rain %>%
              select(date = Date, ppt = `LCRain (mm)`) %>%
              mutate(site = "lakeCity"))

# Plot the daily rainfall at all gages
p_rain_meas <- ggplot(rain_meas, aes(x = as.Date(date), y = ppt)) +
  geom_line() +
  facet_wrap(~ site) +
  scale_x_date(date_labels = "%Y", date_breaks = "20 years") +
  theme_classic() +
  theme(axis.title.x = element_blank()) +
  labs(y = expression("observed rainfall " * (mm ~ d^{-1})))
ggplotly(p_rain_meas)